Add Prodigy, SophiaG optimizers #1350

Kimiko-AI · 2024-03-01T16:47:47Z

Description

https://arxiv.org/pdf/2306.06101.pdf
https://arxiv.org/abs/2305.14342

Motivation and Context

How has this been tested?

No

Screenshots (if appropriate)

Types of changes

Social Handles (Optional)

winglian · 2024-03-01T20:22:37Z

src/axolotl/custom_optim/lion.py

+
+# update functions
+
+def update_fn(p, grad, exp_avg, lr, wd, beta1, beta2):


This update_fn function is redefined below on line 106.
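To illustrate why a redefinition like this matters: in Python, a second def with the same name at the same scope silently rebinds the name, so the earlier body becomes unreachable. The sketch below uses a hypothetical one-argument update_fn with toy bodies, not the code from this PR. pylint reports this pattern as function-redefined (E0102).

```python
# Hypothetical toy functions; these are NOT the update functions from lion.py.

def update_fn(value):
    """First definition: doubles the input."""
    return value * 2


def update_fn(value):  # same name at module scope: rebinds update_fn
    """Second definition: adds one to the input."""
    return value + 1


# Only the most recent definition is reachable through the name.
result = update_fn(10)
print(result)
```

If both variants are needed, giving them distinct names (or keeping them in separate modules) avoids the shadowing.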

winglian · 2024-03-01T20:24:33Z

src/axolotl/custom_optim/sophia.py

+                           hessian: List[Tensor],
+                           state_steps: List[Tensor],
+                           *,
+                           bs: int,


You've annotated bs as an int, but below you call bs.is_cuda()
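A minimal sketch of the mismatch described above, using a plain Python int and no PyTorch dependency. The name batch_size and its value are hypothetical stand-ins for the bs argument. For reference, is_cuda is an attribute that PyTorch tensors expose; a built-in int has no such attribute, so the access fails at runtime.

```python
# Hypothetical stand-in for the `bs: int` keyword argument in the diff.
batch_size = 8

# A built-in int does not expose tensor attributes such as is_cuda.
has_is_cuda = hasattr(batch_size, "is_cuda")

try:
    batch_size.is_cuda  # attribute lookup fails before any call could happen
    error_message = ""
except AttributeError as exc:
    error_message = str(exc)

print(has_is_cuda)
print(error_message)
```

Either the annotation should describe what is actually passed (a tensor), or the device check should be dropped for a plain integer; which one applies depends on how the caller constructs bs.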

Kimiko-AI · 2024-03-02T03:55:38Z

https://wandb.ai/nruaif/Test_optim?workspace=user-nruaif
Some test runs. With Prodigy we don't need to tune lr; just set it to 1.

winglian · 2024-03-06T01:10:09Z

I took a first pass at fixing the pylint errors, but mypy has still found a few issues:

src/axolotl/custom_optim/sophia.py:233: error: "float" has no attribute "neg"  [attr-defined]
src/axolotl/custom_optim/lion.py:161: error: "None" not callable  [misc]
src/axolotl/custom_optim/lion.py:165: error: Invalid type 'Any' for *expr (iterable expected)  [misc]
src/axolotl/custom_optim/lion.py:165: error: Need more than 4 values to unpack (6 expected)  [misc]
src/axolotl/custom_optim/lion.py:175: error: Cannot determine type of "state"  [has-type]
src/axolotl/custom_optim/lion.py:176: error: Cannot determine type of "state"  [has-type]
src/axolotl/custom_optim/lion.py:178: error: Cannot determine type of "state"  [has-type]
src/axolotl/custom_optim/lion.py:180: error: Cannot determine type of "grad"  [has-type]
src/axolotl/custom_optim/lion.py:180: error: Cannot determine type of "lr"  [has-type]
src/axolotl/custom_optim/lion.py:180: error: Cannot determine type of "wd"  [has-type]
src/axolotl/custom_optim/lion.py:180: error: Cannot determine type of "beta1"  [has-type]
src/axolotl/custom_optim/lion.py:180: error: Cannot determine type of "beta2"  [has-type]
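Two of the error kinds above have common, generic fixes. The sketch below is a hedged illustration in plain Python, not the code from sophia.py or lion.py; the function names step and negated_step_size are hypothetical. It assumes the "None" not callable error comes from an optional closure that mypy cannot narrow, and that the "float" has no attribute "neg" error comes from calling a tensor method on a Python float.

```python
from typing import Callable, Optional


def step(closure: Optional[Callable[[], float]] = None) -> Optional[float]:
    """Call the optional closure only after an explicit None check.

    An inline `is not None` test lets mypy narrow the Optional type;
    a helper such as a hypothetical exists(closure) would not.
    """
    loss: Optional[float] = None
    if closure is not None:
        loss = closure()
    return loss


def negated_step_size(lr: float) -> float:
    """Negate a Python float with unary minus instead of a .neg() method."""
    return -lr


print(step())
print(step(lambda: 0.5))
print(negated_step_size(0.1))
```

The remaining has-type errors on lines 175 to 180 all follow the unpacking on line 165, so they may clear up once that statement is typed in a way mypy can follow.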

hammoudhasan · 2024-03-26T23:44:48Z

@Kimiko-AI I really think this pull request is worth finishing! It's very useful. I used Prodigy before for Stable Diffusion fine-tuning and it worked pretty well, so I'd love to see how it performs on LLM training.

Kimiko-AI added 7 commits March 1, 2024 21:48

Add new optims.

f226253

Add new optims.

f426aad

Merge remote-tracking branch 'origin/main'

650f820

Fix val check

c1c1361

Merge remote-tracking branch 'origin/main'

f7f9351

Set bias correction and safeguard_warmup to false

a16079b

Test

93e95e0

winglian reviewed Mar 1, 2024


fix typo

398a94c

chore: lint

24459ee
